Release QAT example with NLS #3480
Signed-off-by: J. Pablo Muñoz <[email protected]> Co-authored-by: Yuan0320 <[email protected]>
Thank you for the contribution and the very extensive evaluation!
It's great to see an improvement over the baseline with a constant LoRA rank!
At a high level, it looks good to me. Most of the logic is implemented in the sample, and changes in NNCF are minimized by extending FQ with LoRA.
I have a few remarks to improve the integration into NNCF.
One thing that is important for potential customers is the total time needed to get the best checkpoint.
Could you please specify in the README how long the tuning and search stages took in both cases?
Review thread on examples/llm_compression/torch/qat_with_lora/NLSDownstreamTasks.md (outdated, resolved)
minor remarks
Co-authored-by: Lyalyushkin Nikolay <[email protected]>
Here are some non-blocking comments. It would be nice to resolve them (possibly in a separate PR) before the NNCF code freeze (end of week).
The latest test examples job passed: https://github.com/openvinotoolkit/nncf/actions/runs/15141407340/attempts/1
Changes
Adds an example of NLS fine-tuning with quantization-aware LoRA on downstream tasks.
Reason for changes
To support fine-tuning for downstream scenarios. NLS often boosts the performance of LoRA fine-tuning on downstream tasks.
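For readers unfamiliar with the idea, here is a minimal, library-free sketch of how an elastic-rank adapter differs from a constant-rank one. It is not the code from this PR or the NNCF API. It assumes the adapter product is added to the weight before fake quantization and that a rank is sampled per training step; the function names, the 4-bit symmetric scheme, and the rank candidates are all hypothetical.

```python
import random


def fake_quantize(weight, num_bits=4):
    """Symmetric per-tensor fake quantization: snap to an integer grid, then rescale."""
    level_high = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(w) for row in weight for w in row)
    scale = max_abs / level_high if max_abs > 0 else 1.0
    return [[round(w / scale) * scale for w in row] for row in weight]


def merged_weight(weight, lora_a, lora_b, active_rank):
    """Return W + B[:, :r] @ A[:r, :], using only the first r adapter components."""
    out_dim, in_dim = len(weight), len(weight[0])
    return [
        [
            weight[i][j] + sum(lora_b[i][k] * lora_a[k][j] for k in range(active_rank))
            for j in range(in_dim)
        ]
        for i in range(out_dim)
    ]


def elastic_forward(weight, lora_a, lora_b, x, active_rank, num_bits=4):
    """Forward pass through a fake-quantized linear layer with a truncated adapter."""
    w_q = fake_quantize(merged_weight(weight, lora_a, lora_b, active_rank), num_bits)
    return [sum(w * xi for w, xi in zip(row, x)) for row in w_q]


def sample_rank(rank_candidates, rng):
    """Pick the adapter rank to activate for one training step."""
    return rng.choice(rank_candidates)


rng = random.Random(0)
max_rank, in_dim, out_dim = 4, 3, 2
rank_candidates = [4, 2, 1]  # hypothetical search space, not taken from the PR
weight = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
lora_a = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(max_rank)]
lora_b = [[rng.uniform(-0.1, 0.1) for _ in range(max_rank)] for _ in range(out_dim)]
x = [1.0, 0.5, -0.25]

for step in range(3):
    rank = sample_rank(rank_candidates, rng)
    y = elastic_forward(weight, lora_a, lora_b, x, rank)
    print(f"step {step}: active rank {rank}, output {y}")
```

A constant-rank baseline corresponds to always passing the same `active_rank`. In this sketch, the search stage would amount to evaluating different `active_rank` choices on the downstream task after tuning.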
Related tickets
https://jira.devtools.intel.com/browse/CVS-166802
Tests
See the results in NLSDownstreamTasks.md. We conducted an extensive evaluation on 11 language models and 4 downstream tasks.
examples job: https://github.com/openvinotoolkit/nncf/actions/runs/14934370942